Abstract

Adaptive antenna arrays are widely used and have great promise to reduce the effects of interference and to increase capacity
in mobile communications systems. Consider a single cell system with a (receiving) antenna array at the base station. The
usual algorithms for obtaining the antenna weights for the adaptive array depend on parameters that are held fixed no matter
what the operating situation, and the performance can strongly depend on the values of these parameters. For example, at
time $k$, we might seek the antenna weights that minimize the performance function $\sum_{l=1}^{k} \alpha^{k-l} e_l^2$, where $e_l$ is the error in
reception at sample time $l$. Typically, $\alpha < 1$ to allow tracking as conditions change. The performance of the algorithm for
adapting the weights in the antenna array depends heavily on the chosen value of the forgetting or discount factor $\alpha$. Generally,
the optimal value will change rapidly in time as the operating conditions change. In some cases (for example, where the
Doppler frequency of the mobile being tracked oscillates), the optimal value of $\alpha$ will also oscillate. We are concerned with the
adaptive optimization of such parameters by the addition of another adaptive loop. The antenna weights and the value of $\alpha$ must
be adapted simultaneously. We give an algorithm for adapting $\alpha$, which is based on an approximation to a natural "gradient
descent" method. The algorithm is practical and can improve the operation considerably. This is justified via simulations under
a variety of operating conditions. The algorithm tracks the optimal value of $\alpha$ very well, and always performs better than the
algorithm that uses any fixed $\alpha$, sometimes much better. The adaptation can be based on a pilot signal or it can be partially blind.
The adaptive algorithm for the parameter can be analyzed via stochastic approximation (SA) theory, where the SA algorithm is
that for adapting $\alpha$.

Methods  Keywords:  Stochastic  Processes,  Optimization,  Control  Theory. 

I.  Introduction 

The  adaptive  antenna  problem:  Formulation.  Adaptive  antenna  arrays  are  widely  used  and  have  great  promise  to 
reduce the effects of interference and to increase capacity in wireless systems [3], [8], [10]. The usual algorithms for
adaptive  arrays  depend  on  parameters  that  are  held  fixed  no  matter  what  the  operating  situation,  and  the  performance 
can  strongly  depend  on  the  values  of  these  parameters.  We  are  concerned  with  the  adaptive  optimization  of  such 
parameters  by  the  addition  of  another  adaptive  loop.  It  will  be  seen  that  the  method  is  practical  and  can  improve  the 
operation  considerably. 

We consider the problem of optimizing reception at the base station of a single cell system with $r$ antennas. The
updates of the antenna weights are to be done in discrete time, but there are natural continuous time analogs. Let
$\bar{x}_{i,k}$, $i \le r$, denote the complex (baseband) output of antenna $i$ at measurement time $k$; it is the sum of the signals
due to all of the mobiles, plus additive noise. Let $\bar{w}_{i,k}$ denote the complex weight assigned to antenna $i$ at time $k$.
Define the vectors $\bar{X}_k = \{\bar{x}_{i,k}, i \le r\}$, $\bar{W}_k = \{\bar{w}_{i,k}, i \le r\}$. Let $\{s_k\}$ denote a real-valued pilot training sequence
from the particular mobile that is being tracked. It is assumed that the training sequence is known at the receiver.
The algorithm to be presented also works well with partially blind adaptation, or with only periodic use of a known
pilot signal, with blind adaptation in between. The weighted output of the array is $\Re\{\bar{W}_k' \bar{X}_k\}$,
where $\Re$ denotes the real part. Henceforth, to simplify the notation, we concatenate the real ($\Re$) and imaginary ($\Im$)
components. Thus, for complex $\bar{x}_k$, let the unbarred quantity $x_k$ denote the concatenation $(\Re \bar{x}_k, \Im \bar{x}_k)$ and define
$X_k$, $W_k$, etc., analogously.


The algorithm for adapting the weights. For $\alpha \in (0, 1]$ and a fixed weight vector $w$, define the discounted
cost

$$J_k(\alpha, w) = \sum_{l=1}^{k} \alpha^{k-l} |s_l - w' X_l|^2. \tag{1.1}$$

Typically $\alpha < 1$ to allow tracking of changing circumstances. Suppose that we wish to minimize the performance
function $E J_k(\alpha, w)$ over $w$, for large $k$. Let $W_k$ denote the value of $w$ that minimizes $J_k(\alpha, w)$. Define the errors
$e_k(w) = s_k - w' X_k$ and $e_{k+1} = s_{k+1} - W_k' X_{k+1}$. The standard recursive least squares algorithm for computing $W_k$
can be written as [7]

$$\begin{aligned}
W_{k+1} &= W_k + L_{k+1} e_{k+1}, \\
L_{k+1} &= \frac{P_k X_{k+1}}{\alpha + X_{k+1}' P_k X_{k+1}}, \\
P_{k+1} &= \frac{1}{\alpha}\left[ P_k - \frac{P_k X_{k+1} X_{k+1}' P_k}{\alpha + X_{k+1}' P_k X_{k+1}} \right],
\end{aligned} \tag{1.2}$$

where the initial weight $W_0$ and matrix $P_0$ are given. The discounted pathwise cost, given the sequence of optimal
weights $\{W_l\}$, is

$$J_k(\alpha) = \sum_{l=1}^{k} \alpha^{k-l} e_l^2. \tag{1.3}$$

The  results  of  simulations  of  this  algorithm  for  mobile  communications  were  reported  in  [10]. 
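
As an illustration of the recursion (1.2), the following is a minimal sketch in plain Python for real-valued (concatenated) vectors. The function names `rls_step` and `demo`, the synthetic data, the initial matrix `100 * I`, and the noise level are assumptions made for this example only; they are not taken from the simulations reported here.

```python
import random

def rls_step(w, P, x, s, alpha):
    """One step of the discounted recursive least squares update (1.2)."""
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]  # P_k X_{k+1}
    denom = alpha + sum(x[i] * Px[i] for i in range(n))             # alpha + X' P_k X
    err = s - sum(w[i] * x[i] for i in range(n))                    # e_{k+1}, uses W_k
    w_new = [w[i] + (Px[i] / denom) * err for i in range(n)]        # W_k + L_{k+1} e_{k+1}
    # P_k is symmetric, so P_k X X' P_k is the outer product of Px with itself.
    P_new = [[(P[i][j] - Px[i] * Px[j] / denom) / alpha for j in range(n)]
             for i in range(n)]
    return w_new, P_new, err

def demo(alpha=0.95, steps=500, seed=0):
    """Hypothetical test problem: fixed true weights, small additive noise."""
    rng = random.Random(seed)
    w_true = [0.5, -1.0, 2.0]
    w = [0.0, 0.0, 0.0]
    P = [[100.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        s = sum(a * b for a, b in zip(w_true, x)) + 0.01 * rng.gauss(0.0, 1.0)
        w, P, _ = rls_step(w, P, x, s, alpha)
    return w, w_true
```

With a fixed true weight vector, as in `demo`, the estimate settles near the true weights; a smaller `alpha` shortens the effective memory and makes the estimate noisier but quicker to follow changes.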


On the value of $\alpha$. The value of the performance function (equivalently, the error rates) can be quite sensitive to
the value of the discount factor $\alpha$, as will be seen when the numerical data is presented at the end of the paper.
We are concerned with the question of its optimal value. If all of the mobiles are stationary and the variance of
the additive noise is constant, then the optimal value of $\alpha$ will be unity. If the mobiles (particularly the one being
tracked) are moving rapidly and the additive noise level is small, then the optimal value of $\alpha$ will be relatively
small. In practice, the optimal value might vary rapidly, perhaps changing significantly many times per second,
as the operating conditions change. Our aim is the development of a practical algorithm for adapting $\alpha$; i.e., for
tracking its current optimal value. It is based on an intuitively reasonable gradient descent idea. The simulations
that are presented in Section V show the rapid response of the adaptive procedure, as well as the impressive gains in
performance that can be achieved.


Outline of paper. In the next section we give some background material concerning an adaptive algorithm for the
problem of tracking time-varying parameters in a linear system, via noisy observations. Although it will not be used
in the sequel, the essential features of that problem are also those of the problem of interest here, since a balance
must be found between the averaging of the noise effects (i.e., large $\alpha$) and the ability to track (i.e., small $\alpha$). The
adaptive procedure finds the proper balance. There, as in our problem, there are two levels of adaptation. One
is that which estimates the parameters (equivalently, our antenna weights), and the other adapts a parameter in the first
adaptive algorithm to optimize its performance. The success of that time-varying parameter tracking algorithm was
the motivation for the one used here, even though the quantities being tracked are different and our current problem is
more complicated. In Section III, we define the adaptive procedure which is to be used for our problem. Then, in Section
IV, the precise model that defines the $\bar{x}_{i,k}$ will be given. The simulation results in Section V clearly demonstrate
the utility of the approach and the behavior of the algorithm under a variety of challenging operating conditions, with
different INR (interference to noise) and SINR (signal to interference plus noise) ratios. The algorithm is a form of
stochastic approximation with "state-dependent" noise [6]. Its behavior can be well approximated by the solution
of a mean ODE (ordinary differential equation), and some brief comments on the mathematical arguments which
provide the theoretical justification that the algorithm is of the gradient descent type are at the end of Section III.

II.  Background:  A  Parameter  Tracking  Algorithm 

In this section, we will discuss a simple form of an adaptive algorithm for a related problem, which motivated
the actual algorithm to be used here. Let $y_n$ be a member of a stationary real-valued random sequence and $\phi_n$ a
(stationary) $r$-dimensional random vector that is correlated with it. The $r$-dimensional vector $\theta$ that minimizes the
mean square error $E|y_n - \theta'\phi_n|^2$ is $\theta = [E\phi_n\phi_n']^{-1} E\phi_n y_n$. A stochastic approximation algorithm for recursively
estimating $\theta$ is

$$\theta_{n+1} = \theta_n + \epsilon \phi_n [y_n - \phi_n'\theta_n], \tag{2.1}$$

where $\epsilon > 0$ is some chosen step size parameter.

Consider, for the moment, the special form where $y_n = \sum_{i=0}^{r-1} \alpha_i(n) X_{n-i} + \rho_n$, where the $\{X_n, \rho_n\}$ are mutually
independent and each of the $X_n$ and $\rho_n$ are identically (in $n$) distributed and $E\rho_n = 0$. Define $\phi_n = (X_n, \ldots, X_{n-r+1})$.
If the $\alpha_i(n)$ do not depend on $n$, then $\theta = \{\alpha_0, \ldots, \alpha_{r-1}\}$. Now suppose that $y_n$ is still stationary, but the parameters
$\theta_n = \{\alpha_0(n), \ldots, \alpha_{r-1}(n)\}$ do vary with time. Then we wish to track $\theta_n$ via an algorithm of the form (2.1). If $\theta_n$
varies rapidly, then the step size $\epsilon$ should be large, to allow tracking. If it varies slowly, but the variance of the $X_n$ or
$\rho_n$ is large, then $\epsilon$ should take a small value. In general, not only do we not know what the optimal value of $\epsilon$ is, but
it changes over time. We need to adapt the value of $\epsilon$, based on the measurements, and at the same time that $\theta_n$ is
being estimated.

Consider the following procedure for choosing the estimates $\epsilon_n$ of the best current value of $\epsilon$. Fix $\epsilon_n = \epsilon$, and let
$\theta_n^\epsilon$ denote the resulting estimates from (2.1). Define the error $e_n(\epsilon) = y_n - \phi_n'\theta_n^\epsilon$. The scheme suggested in [1, p. 160] and
developed in [2], [5] is to find the value of $\epsilon$ that minimizes the stationary value

$$E[y_n - \phi_n'\theta_n^\epsilon]^2/2 = E e_n^2(\epsilon)/2. \tag{2.2}$$

Formally, let $V_n^\epsilon$ denote the "derivative" $d\theta_n^\epsilon/d\epsilon$ for the stationary process. The random variable $\theta_n^\epsilon$ is not a classical
function of $\epsilon$, although its distribution depends on $\epsilon$. But it can be shown [5] that the asymptotic values of the $V_n^\epsilon$ as
used in what follows can be interpreted as mean square derivatives. Formally differentiating (2.2) with respect to $\epsilon$,
assuming stationarity, and setting the derivative to zero yields

$$0 = -E[y_n - \phi_n'\theta_n^\epsilon]\phi_n' V_n^\epsilon = -E e_n(\epsilon)\phi_n' V_n^\epsilon. \tag{2.3}$$

Dropping the $E$ on the right side of (2.3) yields the "stochastic estimate" $-e_n(\epsilon)\phi_n' V_n^\epsilon$ of the gradient of (2.2) with
respect to $\epsilon$ at time $n$. Formally differentiating (2.1) with respect to $\epsilon$ yields

$$V_{n+1}^\epsilon = V_n^\epsilon - \epsilon\phi_n\phi_n' V_n^\epsilon + \phi_n[y_n - \phi_n'\theta_n^\epsilon].$$

This discussion suggests the following algorithm for adapting $\theta$ and $\epsilon$. Let $\mu > 0$ be small and, for $e_n = y_n - \phi_n'\theta_n$,
use the algorithm

$$\theta_{n+1} = \theta_n + \epsilon_n\phi_n e_n, \tag{2.4}$$

$$\begin{aligned}
\epsilon_{n+1} &= \max\{0, \epsilon_n + \mu e_n\phi_n' V_n\}, \\
V_{n+1} &= V_n - \epsilon_n\phi_n\phi_n' V_n + \phi_n[y_n - \phi_n'\theta_n], \quad V_0 = 0.
\end{aligned} \tag{2.5}$$

The algorithm (2.4) and (2.5) has two levels. The parameter $\theta_n$ is tracked by (2.4), while the optimal step size
parameter $\epsilon$ is tracked by (2.5). The actual stochastic approximation algorithm is the one for $\epsilon_n$ in (2.5), where the
step size is $\mu$. The sequence $\{\phi_n, y_n, V_n\}$ plays the role of a driving noise process. The dependence of $V_n$ on $\{\epsilon_n\}$ and
$\{\theta_n\}$ is quite complicated. It is what we call "state-dependent noise" [6]. In applications, the performance is a great
deal less sensitive to the value of $\mu$ in (2.5) than the original algorithm (2.1) is to the choice of $\epsilon$. Despite the apparent
formality in the derivation, the algorithm performs well with $\epsilon_n$ being nearly optimal under broad conditions. It is a
significant improvement over (2.1), with its fixed value of $\epsilon$. A proof of convergence and supporting simulations are
in [5].
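
The two-level structure of (2.4) and (2.5) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names, the synthetic slowly varying scalar parameter, the binary regressor, and the values of `mu` and `eps0` are assumptions for the example. The upper cap `eps_max` is also an assumption added to keep the toy example bounded; the text above uses only the lower truncation at zero.

```python
import math
import random

def adaptive_step_tracking(ys, phis, mu=1e-3, eps0=0.05, eps_max=1.0):
    """Track theta by (2.4) while adapting the step size eps by (2.5)."""
    r = len(phis[0])
    theta = [0.0] * r
    V = [0.0] * r                      # V_0 = 0
    eps = eps0
    thetas, eps_hist = [], []
    for y, phi in zip(ys, phis):
        e = y - sum(p * t for p, t in zip(phi, theta))        # e_n = y_n - phi_n' theta_n
        phiV = sum(p * v for p, v in zip(phi, V))             # phi_n' V_n
        theta_next = [t + eps * p * e for t, p in zip(theta, phi)]   # (2.4)
        eps_next = max(0.0, eps + mu * e * phiV)                     # (2.5), first line
        eps_next = min(eps_max, eps_next)                            # cap: assumption, not in (2.5)
        V = [v - eps * p * phiV + p * e for v, p in zip(V, phi)]     # (2.5), uses eps_n
        theta, eps = theta_next, eps_next
        thetas.append(theta)
        eps_hist.append(eps)
    return thetas, eps_hist

def demo(n=2000, seed=1):
    """Hypothetical scalar example: theta_n drifts sinusoidally, phi_n is +1 or -1."""
    rng = random.Random(seed)
    ys, phis = [], []
    for k in range(n):
        theta_true = math.sin(2.0 * math.pi * k / 500.0)
        phi = [rng.choice((-1.0, 1.0))]
        ys.append(theta_true * phi[0] + 0.1 * rng.gauss(0.0, 1.0))
        phis.append(phi)
    return adaptive_step_tracking(ys, phis)
```

Note that `V` is updated with the step size in force at time `n`, before `eps` is replaced by its new value, matching the indexing in (2.5).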


III. The Adaptive Algorithm for $\alpha$
For $W_k = w$, the expression (1.1) for $J_k(\alpha, w)$ can be written in the recursive form

$$J_{k+1}(\alpha, w) = \alpha J_k(\alpha, w) + e_{k+1}^2(w).$$

If the values of $W_k$ are determined by the least squares algorithm (1.2), then we can write

$$J_{k+1}(\alpha) = \alpha J_k(\alpha) + e_{k+1}^2. \tag{3.1}$$
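
A short check of the recursive form may help: the discounted sum $\sum_{l=1}^{k} \alpha^{k-l} e_l^2$ and the recursion $J_{k+1} = \alpha J_k + e_{k+1}^2$ started from $J_0 = 0$ give the same value. The sketch below uses hypothetical function names and an arbitrary error sequence.

```python
def discounted_cost_direct(errors, alpha):
    """Sum over l = 1..k of alpha**(k - l) * e_l**2, with errors[l - 1] = e_l."""
    k = len(errors)
    return sum(alpha ** (k - l) * errors[l - 1] ** 2 for l in range(1, k + 1))

def discounted_cost_recursive(errors, alpha):
    """Same cost via J_{k+1} = alpha * J_k + e_{k+1}**2, starting from J_0 = 0."""
    J = 0.0
    for e in errors:
        J = alpha * J + e * e
    return J
```

The recursive form needs only the previous cost and the newest error, which is what makes it usable inside an on-line algorithm.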


If the value of $\alpha$ changes with time, then we are concerned with the discounted performance function $E J_k$, where

$$J_{k+1} = \alpha_k J_k + e_{k+1}^2. \tag{3.2}$$

Suppose, for the moment, that the value of $\alpha$ is fixed and that $\{X_k, W_k, e_k\}$ is stationary, where $W_k$ is determined
by (1.2). The stationary distributions will depend on $\alpha$. For this process, define

$$J_k(\alpha) = \sum_{l=1}^{k} \alpha^{k-l} e_l^2.$$


Suppose that we wish to choose $\alpha$ to minimize the stationary expectation $E J_k(\alpha)$ over $\alpha$ for large $k$. The
time-varying parameter tracking algorithm of Section II was based on a "derivative," or "mean-square derivative."
In our case, the dependence of the algorithm (1.2) on $\alpha$ is quite complicated. To avoid dealing with the rather
messy forms that would result from a differentiation of the right sides of (1.2) with respect to $\alpha$, we simply work
with a finite difference form. Typically, the function $J(\alpha) = \min_w E J_\infty(\alpha, w)$ is strictly convex and continuously
differentiable. In our model, the value increases sharply as $\alpha$ increases beyond its optimal value, and increases more
slowly as $\alpha$ decreases below its optimal value. It is somewhat insensitive to $\alpha$ around the optimal value. Finite
difference estimators, for the difference intervals that we use, provide excellent approximations.

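Let $\delta > 0$ be a small difference interval, let $\alpha_k$ denote the value of $\alpha$ at the $k$th update, and define $\alpha_k^\pm = \alpha_k \pm \delta/2$.
The algorithm (1.2) is run for both $\alpha_k^\pm$. Thus, we define the two sets of recursions, for $+$ and $-$:

$$\begin{aligned}
e_{k+1}^\pm &= s_{k+1} - [W_k^\pm]' X_{k+1}, \\
W_{k+1}^\pm &= W_k^\pm + L_{k+1}^\pm e_{k+1}^\pm, \\
L_{k+1}^\pm &= \frac{P_k^\pm X_{k+1}}{\alpha_k^\pm + X_{k+1}' P_k^\pm X_{k+1}}, \\
P_{k+1}^\pm &= \frac{1}{\alpha_k^\pm}\left[ P_k^\pm - \frac{P_k^\pm X_{k+1} X_{k+1}' P_k^\pm}{\alpha_k^\pm + X_{k+1}' P_k^\pm X_{k+1}} \right].
\end{aligned} \tag{3.3}$$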


For small $\mu > 0$, the adaptive algorithm for $\alpha$ is

$$\alpha_{k+1} = \alpha_k - \mu \frac{[e_{k+1}^+]^2 - [e_{k+1}^-]^2}{\delta}. \tag{3.4}$$


The initial weight $W_0$ was found using the least-squares solution ($\alpha = 1$ in (1.1)) from an initial small block of data.
The initial matrix $P_0$ is just the inverse of the sample covariance matrix using data from this block. The initial parameter
$\alpha_0 \in (0, 1)$ is arbitrarily chosen.
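
The finite difference scheme (3.3) and (3.4) amounts to running two copies of the least squares recursion, one at $\alpha_k + \delta/2$ and one at $\alpha_k - \delta/2$, and moving $\alpha_k$ against the difference of their squared errors. The following plain Python sketch illustrates this. The function names, the synthetic two-dimensional data with rotating true weights, the initial matrix `10 * I`, and the clipping of $\alpha_k$ to `[lo, hi]` are all assumptions introduced for the example; the default `mu` and `delta` are the values quoted in Section V.

```python
import math
import random

def rls_step(w, P, x, s, alpha):
    """One discounted least squares step; returns new weights, new P, and the error."""
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = alpha + sum(x[i] * Px[i] for i in range(n))
    err = s - sum(w[i] * x[i] for i in range(n))
    w_new = [w[i] + (Px[i] / denom) * err for i in range(n)]
    P_new = [[(P[i][j] - Px[i] * Px[j] / denom) / alpha for j in range(n)]
             for i in range(n)]
    return w_new, P_new, err

def adapt_alpha(xs, ss, alpha0=0.9, mu=0.0008, delta=0.002, lo=0.8, hi=0.999):
    """Run the + and - recursions (3.3) and update alpha by (3.4)."""
    n = len(xs[0])
    def init_P():
        return [[10.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    w_p, P_p = [0.0] * n, init_P()
    w_m, P_m = [0.0] * n, init_P()
    alpha, path = alpha0, []
    for x, s in zip(xs, ss):
        w_p, P_p, e_p = rls_step(w_p, P_p, x, s, alpha + delta / 2.0)
        w_m, P_m, e_m = rls_step(w_m, P_m, x, s, alpha - delta / 2.0)
        alpha = alpha - mu * (e_p * e_p - e_m * e_m) / delta   # (3.4)
        alpha = min(hi, max(lo, alpha))                        # clipping: assumption
        path.append(alpha)
    return path

def demo(steps=1500, seed=2):
    """Hypothetical data: true weights rotate slowly, observations are noisy."""
    rng = random.Random(seed)
    xs, ss = [], []
    for k in range(steps):
        w_true = [math.cos(0.01 * k), math.sin(0.01 * k)]
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        xs.append(x)
        ss.append(w_true[0] * x[0] + w_true[1] * x[1] + 0.1 * rng.gauss(0.0, 1.0))
    return adapt_alpha(xs, ss)
```

Both copies see the same data and differ only in their discount factor, so the difference of squared errors isolates the effect of $\alpha$.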
If $\alpha$ is fixed, and the (temporarily assumed) stationary distribution is used, then for small $\delta > 0$, the (stationary)
expectation of the coefficient of $\mu$ in (3.4) should be close to the derivative of $E J_k(\alpha)$ with respect to $\alpha$. The intuitive
idea behind the algorithm is that the value of $\alpha$ changes much more slowly than that of $w$, so that we are essentially
in the stationary state. In this case, we clearly have a stochastic algorithm driven by a process whose values are
estimates of the negative of a gradient. It turns out that the idea can be justified, both in simulations (Section V) and
mathematically. There are more equations to work with in the finite difference form, as opposed to the form using
formal derivatives. But the equations are much simpler. Various simpler stochastic approximation forms were also
dealt with, and will be discussed elsewhere.


Comment on the asymptotic properties of the algorithm. Using results from the theory of stochastic approximation,
one can analyze the algorithm (3.3) and (3.4). The basic stochastic approximation algorithm is (3.4), since $\mu$ is
small. The quantities $(X_k, W_k^\pm, s_k, L_k^\pm, P_k^\pm)$ play the role of noise. In this paper, we are concerned with the
simulations and the presentation and motivation of the algorithm. There is little space for dealing with the convergence,
and we confine ourselves to a few motivational remarks. Owing to the way that the evolution of this noise is tied to
that of $\alpha_k$, we have what is called state-dependent noise [6, Chapter 8]. The asymptotic behavior (small $\mu$) of (3.4)
is determined by a mean ODE, whose right hand side is a "local" average of the coefficient of $\mu$ in (3.4). Loosely
speaking, since the $\alpha_k$ sequence varies much more slowly than do the driving noises, one can compute this local
average by assuming that $\alpha_k$ is fixed. Suppose that there is a $g(\alpha)$ and $m$ such that

$$-\frac{1}{m} \sum_{l=n}^{n+m-1} E_n \frac{[e_{l+1}^+]^2 - [e_{l+1}^-]^2}{\delta} \approx g(\alpha)$$

for large $n$, $m$, where $\alpha_l$ is held fixed at $\alpha$, and where $E_n$ is the expectation given the data to time $n$. Then the mean
ODE is $\dot{\alpha} = g(\alpha)$ [6, Chapter 8].

In our case, the process $X_k$ will rarely be stationary or ergodic. For example, see (4.3), which is summed over $j$,
i.e., the mobiles, to give $X_k$. In (4.3), the dominant effect is that of the Doppler frequencies $\omega^d_{j,k}$. This frequency
will change over time. But, over short time intervals, $X_k$ varies rapidly in a periodic way and the desired averaging
will occur. Full details for such problems are in [6, Chapter 8], and will be explored in detail for the current problem
elsewhere. But the overall conclusion is that, for small $\mu$, the algorithm for adapting $\alpha$ behaves as a (finite difference
approximation to a) gradient descent algorithm, which is what we were aiming for.


IV.  The  Physical  Model 

In order to keep the simulations simple and focus on the essential issue of adaptation, the mobiles move in two
dimensions. The three antennas are evenly spaced with spacing $d > \lambda/2$, where $\lambda$ is the carrier wavelength (the
carrier frequency is $800 \times 10^6$ Hz). The sample (bit) period is $h = 4 \times 10^{-5}$ sec. The number of strong interfering
mobiles is either one ($N_I = 1$) or three ($N_I = 3$). Their amplitudes (i.e., square root of power) vary from about 1/4
that of the desired mobile, to being roughly equal. In the presented simulations, there is no scattering associated with
these strongly interfering mobiles; i.e., they are in a line-of-sight (LOS) environment. However, we note that when
scattering is added, the effects of the adaptation are as impressive as in the LOS case and the behavior of the adapted $\alpha$
is much more complicated. We model the effect of additional interferers with uniform scattering (Rayleigh fading)
by adding complex Gaussian noise to each antenna which is independent in time and across the antennas. This noise
can be assumed to be independent across the antenna elements since $d > \lambda/2$. We assume a median field strength
model where the signal amplitude at the receiver from mobile $j$ at time $k$ is $1/d_{j,k}^2$, where $d_{j,k}$ is the distance to a
reference antenna in the array [4].

We assume a narrowband signal (carrier frequency $\gg$ signal bandwidth) so the signal does not change appreciably
over the time that it takes to traverse the antenna array. The interferers are in the far field so their transmitted
electromagnetic wave can be assumed to be a plane wave at the antenna array. The pilot signal $s_k$, for the tracked or
desired user, is assumed known. It is i.i.d., binary (+1, -1), and is independent of the signals from the other mobiles.
In practice, there would be either a training period or reference signals sent periodically, as part of the desired user's
synchronization signal.

Fig. 1.

The signal in each antenna is the sum of those emanating from the mobiles, plus complex Gaussian noise. The
noise terms are assumed to be mutually independent, the real and imaginary parts are independent, and each has the
same variance $\sigma^2$. Define, as usual,

$$\text{SINR} = 10 \log \frac{P_{des}}{\sum_{i=1}^{N_I} P_i + 2\sigma^2}, \qquad \text{INR} = 10 \log \frac{\sum_{i=1}^{N_I} P_i}{2\sigma^2},$$

where $P_{des}$ and $P_i$, resp., are the signal powers (at the antenna) of the desired and $i$th interfering mobile, resp. The
most important factor in the determination of the optimal value of $\alpha$ at any time is the Doppler shift, although the
values are also affected by the SINR and INR.
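
As a small numerical illustration of these two ratios, the sketch below evaluates them in decibels, assuming the usual definitions (desired power over interference plus total noise power $2\sigma^2$, and interference over $2\sigma^2$) with base-10 logarithms. The function name and the sample powers are hypothetical and do not correspond to the baseline cases of Section V.

```python
import math

def sinr_inr_db(p_des, p_interf, sigma2):
    """Return (SINR, INR) in dB for desired power p_des, a list of interferer
    powers p_interf, and per-component noise variance sigma2."""
    noise = 2.0 * sigma2                 # real and imaginary parts each contribute sigma2
    total_interf = sum(p_interf)
    sinr = 10.0 * math.log10(p_des / (total_interf + noise))
    inr = 10.0 * math.log10(total_interf / noise)
    return sinr, inr
```

For instance, `sinr_inr_db(1.0, [0.5], 0.25)` has interference and noise each equal to 0.5, so both ratios are 0 dB.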

The Doppler frequency of mobile $j$ at sample time $k$ is

$$\omega^d_{j,k} = \frac{2\pi}{\lambda} v_{j,k} \cos(\phi_{j,k} - \gamma_{j,k}),$$

where $\gamma_{j,k}$ is the angle of the travel of mobile $j$ (see Figure 1), $v_{j,k}$ its speed, and $\phi_{j,k}$ the angle of arrival of its plane
wave, all at sample time $k$. The spatial signature corresponding to a plane wave arriving at angle $\phi$ to the normal to
the plane of the antennas (see Figure 1) is given by the column vector (antenna 1 is the reference antenna)

$$c(\phi) = \left[ 1, \ \exp\left(-i \frac{2\pi}{\lambda} d \sin\phi\right), \ \exp\left(-i \frac{2\pi}{\lambda} 2d \sin\phi\right) \right]', \tag{4.1}$$

where $\lambda$ is the carrier wavelength. We denote the spatial signature corresponding to mobile $j$ at time $k$ by $c_{j,k}$, where
$\phi_{j,k}$ is used in (4.1). The component of the received signal at the antenna array at sample time $k$ and which is due to
mobile $j$ is given by

$$\bar{x}_{j,k} = \frac{1}{d_{j,k}^2} s_{j,k} \exp\left( i \sum_{l=1}^{k} \omega^d_{j,l} h \right) c_{j,k}, \tag{4.3}$$

where $h$ is the time interval between updates (which is $4 \times 10^{-5}$ seconds in our simulations). The (complex) signal
received by the array at sample time $k$ is $\bar{X}_k = \sum_j \bar{x}_{j,k}$. Of particular interest is the case where the wave number
$2\pi/\lambda$ is very large so that small variations in the mobility of the mobile can lead to large changes in the Doppler
frequency. The signal $X_k$ can be based either on TDMA or CDMA. In the latter case, it is measured after the matched
filters (which use the signature of the desired user).
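
The pieces of this model (Doppler frequency, spatial signature, and the per-mobile received component) can be put together in a short plain Python sketch. It is illustrative only and rests on several assumptions: the signature uses the phase convention $\exp(-i (2\pi/\lambda) m d \sin\phi)$ for antenna $m = 0, 1, 2$, the amplitude is taken as $1/d_{j,k}^2$, the accumulated phase is $\sum_l \omega^d_{j,l} h$, the speed of light is rounded to $3 \times 10^8$ m/s, and all function and variable names are hypothetical.

```python
import cmath
import math

SPEED_OF_LIGHT = 3.0e8                     # m/s, rounded (assumption)
CARRIER_HZ = 800e6                         # carrier frequency from the text
WAVELENGTH = SPEED_OF_LIGHT / CARRIER_HZ   # 0.375 m under the rounding above
H = 4e-5                                   # sample period from the text, seconds

def doppler(speed, phi, gamma, lam=WAVELENGTH):
    """Doppler frequency (rad/s) for speed (m/s), arrival angle phi, travel angle gamma."""
    return (2.0 * math.pi / lam) * speed * math.cos(phi - gamma)

def signature(phi, d, lam=WAVELENGTH, n_ant=3):
    """Spatial signature for a plane wave arriving at angle phi; antenna 1 is the reference."""
    return [cmath.exp(-1j * (2.0 * math.pi / lam) * m * d * math.sin(phi))
            for m in range(n_ant)]

def received_component(symbols, dists, speeds, phis, gammas, d, h=H):
    """Per-sample contribution of one mobile to the complex array output."""
    phase, out = 0.0, []
    for s, dist, v, phi, gamma in zip(symbols, dists, speeds, phis, gammas):
        phase += doppler(v, phi, gamma) * h          # accumulated Doppler phase
        amp = s / dist ** 2                          # amplitude decays as 1/d^2
        out.append([amp * cmath.exp(1j * phase) * c for c in signature(phi, d)])
    return out
```

Summing `received_component` over the mobiles and adding complex Gaussian noise to each antenna would give the array output described above.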

Note  that  the  model  is  close  to  what  was  used  in  [10],  except  that  we  allow  somewhat  larger  Doppler  frequencies. 

V.  Discussion  of  the  Simulations 

We will describe the performance of the algorithm for adapting $\alpha$, via a set of simulations. Unless otherwise
noted, the direction and velocity of each mobile evolved as a semi-Markov process, each moving independently of
the others. They were constant for a short random interval, then there was sudden acceleration or deceleration in
each coordinate, and so forth. In the plots, only the associated piecewise constant Doppler shifts are given, since that
is the most important factor in the adaptation of $\alpha$. The simulations start with two "baseline" sets of data, one for
$N_I = 1$ and one for $N_I = 3$. Then the data is varied to explore the behavior under a variety of operating situations.
For $N_I = 1$, the baseline is SINR = 5.3 dB, INR = 1.3 dB, and for $N_I = 3$ it is SINR = 0.5 dB, INR = 7.9 dB. Mobile
2 is always the desired one. In each baseline case, the signal amplitudes at the receiver of the interfering mobiles
are approximately the same, and are approximately one fourth that of the desired mobile. This represents a large
interference, especially with CDMA. The given SINR and INR values are only approximations, since the signal
strength changes over the simulation interval; however, these changes are relatively small. We used $\mu = .0008$ and
$\delta = .002$. Changing the value of $\mu$ up or down by a factor of four had little effect on the overall performance.

Starting  with  each  baseline  case  (described  below),  the  data  was  varied  systematically,  as  follows. 

1. Interferer(s) are moved closer to the antennas. This results in an increased INR and a decreased SINR. The
interferers are such that their power at the antennas is approximately the same as that of the desired mobile.

2.  Interferer(s)  are  moved  further  away  from  the  antennas.  This  results  in  a  decreased  INR  and  an  increased 
SINR. 

3.  Increased  variance  of  the  additive  complex  noise.  This  corresponds  to  increasing  the  number  of  scattering 
sources.  It  results  in  a  decreased  SINR  and  a  decreased  INR. 

4.  Decreased  variance  of  the  additive  complex  noise.  This  results  in  an  increased  SINR  and  an  increased  INR. 

5. Other mobility models. To get a better idea of the behavior of the adaptive algorithm, models where the Doppler
frequency moved either in a "straight line" or zigzagged were simulated.

We will see that $\alpha_k$ "tracks" the Doppler frequency of the desired user in all cases, and that there is a significant
improvement in the performance, over that corresponding to the use of constant values of $\alpha$. The performance is
much less sensitive to the value of $\mu$ than it is to the value of $\alpha$, which supports the conclusions in [5].

A.  Baseline  Cases 

In all cases, mobile 2 is the desired one. First we note that if the mobiles are stationary, where $\alpha = 1$ is optimal,
then the values of $\alpha_k$ in the adaptive procedure remained close to unity, and the sample mean square errors were
virtually indistinguishable from those for the case $\alpha_k = 1$.

The behavior of the adapted $\alpha$-process as well as of the Doppler frequencies of a typical simulation for the baseline
case with $N_I = 1$ are in Figure 2. Note that the Doppler shift is the vertical scale times $10^4$. Mobile 2 starts with a
high Doppler frequency (corresponding to a velocity of approximately 150 km/hr), which then decreases suddenly at
$t = .6$ sec., then decreases more slowly, and finally increases slightly. The behavior of $\alpha$ is typical for all simulations
with this mobility for mobile 2. It initially oscillates about $\alpha = .85$, which is very close to the optimal value for
the associated Doppler shift. Then, when the Doppler frequency drops to about $1.5 \times 10^4$, $\alpha$ increases quickly,
and then continues to increase (on the average) as the Doppler frequency continues to drop. At $t = 1$, the Doppler
frequency rises slightly and then remains constant. Except for the brief transient periods, the values of $\alpha$ are close
to the optimal. When smaller $\mu$ is used, the paths of $\alpha$ are smoother and the transient period longer, but the overall
performance is very similar. Note that the behavior of the Doppler frequency of the interfering mobile had negligible
effect on $\alpha$.
Figure 3 plots the running mean squared error (or moving average [MA]) cost for the adapted algorithm together
with those for the algorithm with constant values of $\alpha$. At time $k$ the MA cost is given by

$$\frac{1}{k} \sum_{l=1}^{k} e_l^2.$$
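
The MA cost can be accumulated on line, as in the following small sketch; the function name is hypothetical and the error sequence stands for the reception errors $e_l$ produced by whichever algorithm is being evaluated.

```python
def running_mse(errors):
    """Return the list of running mean squared errors (1/k) * sum_{l<=k} e_l**2."""
    out, total = [], 0.0
    for k, e in enumerate(errors, start=1):
        total += e * e
        out.append(total / k)
    return out
```

For example, `running_mse([1.0, -1.0, 2.0])` returns `[1.0, 1.0, 2.0]`.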

In our simulations we do not model the detection since we are focusing simply on the benefit of the adaptive-$\alpha$
algorithm. But we note that in general $e_l^2 > 0$ even when there is perfect detection, so that our MA costs may seem high.
The important point is the relative performance between the constant-$\alpha$ and adaptive-$\alpha$ algorithms. In Figure 3, the
constant value $\alpha = .84$ gives results that are very close to the optimum. But, with other system data the best constant
values are different, and the cost might still be significantly larger than that of the adaptive algorithm. The
use of constant values of $\alpha$ never outperformed the adaptive algorithm. Except for the cases of very high Doppler
frequencies, the performance was approximately the same if blind adaptation was used, with the pilot signal being
used only for initialization.

Figure 4 gives the adaptation process for the baseline case $N_I = 3$. The results are similar, despite the fact that the
number of mobiles is greater than the number of antenna elements. Again, the behavior of the interfering mobiles
had little effect on the evolution of $\alpha$.

When the additive noise is increased, the optimal value of $\alpha$ increases. See Figure 5, where SINR = -1.5 dB and INR
= -8.7 dB, which represents a large increase in the noise. The wilder behavior of $\alpha$ is due to the larger noise. There is
a fairly short term memory in these algorithms, so the randomness in the noise sequence has a significant effect on
$\alpha$. The behavior is smoother if smaller $\mu$ or larger $\delta$ is used. But, in all cases, the adaptive algorithm outperformed
the constant $\alpha$ forms, sometimes significantly (see Figure 6). If the variance of the additive noise is decreased,
then the optimal values of $\alpha$ decrease, and the adaptive algorithm still kept close to the optimal value. When the
noise variance is smaller, the other properties of the paths of the desired and interfering mobiles play a greater role,
although the dominant influence is still the Doppler frequency of the desired mobile.

Moving the interfering mobiles further out or closer in had little effect on the paths of the adapted $\alpha$, although the
paths jumped about the mean values somewhat more for the smaller INR cases. For $N_I = 3$, and the mobiles further
out, SINR = 7.1 dB, INR = -2.4 dB. A typical plot of the $\alpha$-path is in Figure 7.

For an example where the Doppler frequency of the desired mobile zigzagged in a saw-tooth fashion, see Figure
8, for the case $N_I = 3$. The optimal value of $\alpha$ also varies in a saw-tooth fashion, and the adaptive algorithm tracks
it very well. For another example, where the Doppler frequency of the desired mobile is linearly decreasing, see
Figure 9 for the case $N_I = 1$. The adaptive algorithm tracks very well. These examples also illustrate the property
that the cost is better (sometimes much better) than the alternative of using any fixed value of $\alpha$ (see Figure 10 for
the saw-toothed example).